Machine learning is being incorporated into more and more everyday apps. Soon, anything with a camera will use image recognition based on deep learning models.
The goal of this project is to build such a model. It will recognize flowers, and I will build a simple application around it. In this notebook I develop a neural network and export it to a file. The application will use this model file to predict flower types in images.
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import json
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_hub as hub
tfds.disable_progress_bar()
import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)
# The new version of the dataset is only available in the tfds-nightly package
%pip --no-cache-dir install tfds-nightly --user
# Need to restart the kernel after this is run for the first time
The dataset comes from the University of Oxford and contains 8189 images of 102 flower types. It already comes with training, validation and test splits.
I am going to load it from the TensorFlow Datasets collection.
%%capture
# Download data to default local directory "~/tensorflow_datasets"
!python -m tensorflow_datasets.scripts.download_and_prepare --register_checksums=True --datasets=oxford_flowers102
# Load the dataset with TensorFlow Datasets
dataset, dataset_info = tfds.load('oxford_flowers102', as_supervised = True, with_info = True)
# Create a training set, a validation set and a test set
training_set = dataset['train']
validation_set = dataset['validation']
test_set = dataset['test']
# Get the number of examples in each set from the dataset info
print("Number of examples:")
num_training_examples = dataset_info.splits['train'].num_examples
print("- training", num_training_examples)
num_validation_examples = dataset_info.splits['validation'].num_examples
print("- validation", num_validation_examples)
num_test_examples = dataset_info.splits['test'].num_examples
print("- test", num_test_examples)
# Get the number of classes in the dataset from the dataset info
num_classes = dataset_info.features['label'].num_classes
print("\nNumber of classes", num_classes)
# Print the shape and corresponding label of 3 images in the training set
print("Shape and label of the first 3 images:")
for image, label in training_set.take(3):
    print('\u2022 shape', image.shape, '\n\u0020 label', label.numpy())
After looking at a few images in the dataset, I notice that they come in different sizes. The neural network will work best if they are all the same size, so I will resize them before feeding them to the model for training and prediction.
# Plot 1 image from the training set and its label
for image, label in training_set.take(1):
    plot_title = label.numpy()
    plt.imshow(image, cmap = plt.cm.binary)
    plt.title(plot_title)
    plt.colorbar()
    plt.show()
That's what a raw image from the dataset looks like. You can see the numerical label at the top.
Let's load the flower names corresponding to the labels. Right now each image is paired with a number from 1 to 102; to print names, I need to translate between these numerical labels and the loaded names.
with open('label_map.json', 'r') as f:
    class_names = json.load(f)
# Plot the image again, this time with corresponding class name
for image, label in training_set.take(1):
    plot_title = class_names[str(label.numpy())]
    plt.imshow(image, cmap = plt.cm.binary)
    plt.title(plot_title)
    plt.colorbar()
    plt.show()
And here is the same image but now with the flower name.
Here I create the pipeline that will feed the images to the model. It will resize each image to 224 x 224 pixels. For convenience, it will also normalize the color range, so that pixel values go from 0 to 1 instead of 0 to 255.
There will be separate pipelines for the training, validation and test sets. For efficiency, each pipeline will supply the images in batches of 64.
BATCH_SIZE = 64
IMG_SHAPE = 224
# Normalize color range of an image from 0-255 to 0-1
def normalize(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255
    return image, label
# Resize image
def resize(image, label):
    image = tf.image.resize(image, (IMG_SHAPE, IMG_SHAPE))
    return image, label
# Apply the transformations and return a pipeline
def batchesOfImages(dataset): # "dataset" avoids shadowing the built-in "set"
    return dataset.cache() \
                  .shuffle(num_training_examples // 4) \
                  .map(resize) \
                  .batch(BATCH_SIZE) \
                  .map(normalize) \
                  .prefetch(1)
# Create a pipeline for each set
training_batches = batchesOfImages(training_set)
validation_batches = batchesOfImages(validation_set)
test_batches = batchesOfImages(test_set)
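A quick way to verify that a pipeline like this emits what the model expects is to pull one batch and inspect its shape and value range. Below is a self-contained sketch using synthetic images in place of the flower data; the image sizes and the `preprocess` helper are illustrative stand-ins for the functions above:

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the flower images: 8 fake 300x300 RGB images,
# so the resize step has something to do
images = np.random.randint(0, 256, size=(8, 300, 300, 3), dtype=np.uint8)
labels = np.arange(8, dtype=np.int64)

def preprocess(image, label):
    # Same steps as the pipeline above: resize to 224x224, scale to 0-1
    image = tf.image.resize(tf.cast(image, tf.float32), (224, 224))
    return image / 255, label

pipeline = tf.data.Dataset.from_tensor_slices((images, labels)) \
                          .map(preprocess) \
                          .batch(4) \
                          .prefetch(1)

for batch_images, batch_labels in pipeline.take(1):
    print(batch_images.shape)  # (4, 224, 224, 3)
```

The shape `(4, 224, 224, 3)` is exactly what the model will expect: batch size, height, width, color channels.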
Now it is time to build the model. Image classification is a challenging problem, and training a network from scratch can take a lot of time and computational power. For this reason, I am going to use a pre-trained neural network. This technique is called transfer learning: you take an existing network that was trained for a similar task and tune it for your problem.
I am going to use a network called MobileNet that comes from the TensorFlow Hub repository. It has been trained on ImageNet, a massive dataset with over 1 million labeled images in 1000 categories, so it is already good at recognizing objects in images in general. I want to make use of that when building the image classifier for flowers.
Because MobileNet was trained for a similar task, it should be enough to replace just the output layer. The idea is that the inner layers of the network stay the same; it will just learn to name flowers instead of objects in general. Another thing to consider is the size of the training set: with just 1020 images, there is not enough data to accurately tune the weights of the whole network. So the best approach is to keep the existing weights frozen.
I will name the loaded model feature_extractor, because what the inner layers of the network do is decode an image into a set of features.
# Load MobileNet
URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL, input_shape=(IMG_SHAPE, IMG_SHAPE, 3))
# Set its weights constant
feature_extractor.trainable = False
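To double-check that the freeze works as intended, you can count a model's trainable variables. Here is a minimal self-contained sketch where an ordinary Dense layer stands in for the hub module; `toy_model` and the layer sizes are purely illustrative:

```python
import tensorflow as tf

# Stand-in for the hub feature extractor: any Keras layer can be
# frozen the same way by setting trainable = False
frozen = tf.keras.layers.Dense(10)
head = tf.keras.layers.Dense(2)

toy_model = tf.keras.Sequential([frozen, head])
toy_model.build(input_shape=(None, 4))
frozen.trainable = False

# Only the head's kernel and bias remain trainable
print(len(toy_model.trainable_variables))  # 2
```

In the real model below, the summary will likewise show the MobileNet weights under "Non-trainable params".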
Now I add the output layer. It has 102 output nodes, one for each flower name. During training the network will learn how to translate the image features into one of the flower types, signaling probabilities in the corresponding output nodes.
For example, it might output 0.9 in the node corresponding to iris, with the remaining 0.1 distributed in some way among the other nodes.
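The softmax activation on this layer is what makes the 102 outputs behave like probabilities that sum to 1. A toy sketch with made-up logits for 5 classes (the numbers are illustrative):

```python
import numpy as np

# Made-up raw scores (logits) for 5 classes
logits = np.array([3.2, 1.0, 0.2, -1.0, 0.5])

# Softmax: exponentiate, then normalize so everything sums to 1
probabilities = np.exp(logits) / np.sum(np.exp(logits))

print(np.round(probabilities, 3))
print(probabilities.sum())  # ~1.0
```

The largest logit gets by far the largest share of the probability mass, which is exactly the "0.9 for iris, 0.1 for everything else" behavior described above.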
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(102, activation = 'softmax')
])
model.summary()
# Train the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
EPOCHS = 10
history = model.fit(training_batches,
                    epochs = EPOCHS,
                    validation_data = validation_batches)
epochs = range(1, EPOCHS + 1)
plt.subplot(1, 2, 1)
plt.plot(epochs, history.history['loss'], 'b--', label = 'training')
plt.plot(epochs, history.history['val_loss'], 'b', label = 'validation')
plt.legend()
plt.title("Loss in training and validation")
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.subplot(1, 2, 2)
plt.plot(epochs, history.history['accuracy'], 'g--', label = 'training')
plt.plot(epochs, history.history['val_accuracy'], 'g', label = 'validation')
plt.legend()
plt.title("Accuracy in training and validation")
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.tight_layout() # to make plots not overlap
plt.show()
The training result is pretty good, at about 80% validation accuracy. However, the gap between the curves grows significantly, which indicates overfitting.
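One common way to limit overfitting, not used in this notebook, is to stop training once validation loss stops improving. A sketch of how Keras' EarlyStopping callback could be wired in:

```python
import tensorflow as tf

# Stop once validation loss hasn't improved for 2 epochs,
# and roll back to the best weights seen so far
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor = 'val_loss',
    patience = 2,
    restore_best_weights = True)

# It would be passed to fit() like this:
# history = model.fit(training_batches,
#                     epochs = EPOCHS,
#                     validation_data = validation_batches,
#                     callbacks = [early_stop])
```

With a small training set like this one, stopping early is often cheaper and simpler than adding regularization layers.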
Now let's see how the model would perform in practice. I am going to do a real test on previously unseen data: the test set.
model.evaluate(test_batches)
Test accuracy and loss are not much different from the validation results. The network predicts correctly about 77% of the time.
Here I export the model to an HDF5 file.
# Save the model as a Keras model
save_filename = "image_classifier"
save_filepath = './{}.h5'.format(save_filename)
model.save(save_filepath)
And load it again to make sure that it works.
model = tf.keras.models.load_model(
    save_filepath,
    custom_objects = {'KerasLayer': hub.KerasLayer} # the model contains a hub.KerasLayer, which Keras needs help to deserialize
)
model.summary()
By comparing this summary with the previous one you can see that it is the same model.
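A stricter check than eyeballing the summaries is to compare the weights before and after the round trip. Below is a self-contained sketch with a tiny stand-in model; the model itself and the scratch filename `roundtrip_check.h5` are illustrative:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the same round-trip check applies to the
# flower classifier saved above
m = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3)
])
m.save('roundtrip_check.h5')

m2 = tf.keras.models.load_model('roundtrip_check.h5')

# Every weight tensor should survive the save/load unchanged
same = all(np.allclose(a, b) for a, b in zip(m.get_weights(), m2.get_weights()))
print(same)  # True
```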
Now the network is trained and I can use it for prediction. I am going to create a function named predict. It will take an image, a model and a value k, and will return a list of the k most likely flower names with their probabilities.
The model was trained on resized images, and it will predict on resized images as well. So before I can pass an image to the model, I need to apply the same steps as before: resize it and normalize its colors.
# Create the process_image function
# Resize image to 224x224 and normalize the color range
def process_image(image):
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, (IMG_SHAPE, IMG_SHAPE))
    image /= 255
    return image
Let's use a test image to see how it works.
from PIL import Image
image_path = './test-images/hard-leaved_pocket_orchid.jpg'
im = Image.open(image_path)
test_image = np.asarray(im)
processed_test_image = process_image(test_image)
fig, (ax1, ax2) = plt.subplots(figsize=(10,10), ncols=2)
ax1.imshow(test_image)
ax1.set_title('Original Image')
ax2.imshow(processed_test_image)
ax2.set_title('Processed Image')
plt.tight_layout()
plt.show()
And it does what it is supposed to do.
Here is the predict function. It takes a path to a raw image file and applies pre-processing before feeding it to the model.
# Create the predict function
def predict(image_path, model, top_k):
    # Prepare image
    image = Image.open(image_path)
    image = np.asarray(image)
    image = process_image(image)
    # Add extra dimension expected by the model
    image = np.expand_dims(image, axis = 0) # Change shape from (224, 224, 3) to (1, 224, 224, 3)
    # Predict
    predictions = model.predict(image)[0]
    # Get indexes of top k predictions
    sorted_indexes = np.argsort(predictions)[::-1] # Sort indexes in reverse (descending) order
    top_k_indexes = sorted_indexes[0:top_k]
    # Extract probabilities and classes based on indexes
    probabilities = [predictions[i] for i in top_k_indexes]
    classes = [class_names[str(i + 1)] for i in top_k_indexes] # +1 because class_names keys range from 1 to 102
    return probabilities, classes
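The top-k extraction inside predict can be sanity-checked in isolation. A small sketch with a made-up prediction vector for 6 classes:

```python
import numpy as np

# Toy prediction vector for 6 classes (made-up numbers)
predictions = np.array([0.05, 0.50, 0.10, 0.02, 0.30, 0.03])
top_k = 3

sorted_indexes = np.argsort(predictions)[::-1]  # descending order
top_k_indexes = sorted_indexes[:top_k]
top_probabilities = [float(predictions[i]) for i in top_k_indexes]

print(top_k_indexes.tolist())  # [1, 4, 2]
print(top_probabilities)       # [0.5, 0.3, 0.1]
```

The indexes come out ordered from most to least likely, which is exactly the order the plotting function below relies on.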
Let's use the model and see how it performs. There are 4 sample images to classify.
To visualize the results, I'm going to plot the predictions next to the image and its true label.
# Function plotting image and its true class against model's predictions
def display_prediction(image_path, true_class, model, top_k):
    image = Image.open(image_path)
    image = np.asarray(image)
    probabilities, classes = predict(image_path, model, top_k)
    fig, (ax1, ax2) = plt.subplots(figsize = (10, 10), ncols = 2)
    ax1.imshow(image)
    ax1.set_title(true_class)
    ax1.axis('off')
    y_ticks = np.arange(top_k)
    ax2.barh(y_ticks, probabilities)
    ax2.set_aspect(0.1)
    ax2.set_yticks(y_ticks)
    ax2.set_yticklabels(classes)
    ax2.set_title('Class Probability')
    ax2.set_xlim(0, 1.1)
    ax2.invert_yaxis() # labels read top-to-bottom
    plt.tight_layout()
# Test images data
test_image_paths = [
    './test-images/cautleya_spicata.jpg',
    './test-images/hard-leaved_pocket_orchid.jpg',
    './test-images/orange_dahlia.jpg',
    './test-images/wild_pansy.jpg'
]
test_image_true_classes = [
    'cautleya spicata',
    'hard-leaved pocket orchid',
    'orange dahlia',
    'wild pansy'
]
# Plot results for each image
for (image_path, true_class) in zip(test_image_paths, test_image_true_classes):
    display_prediction(image_path, true_class, model, 5)
The results on the sample images are promising. The model was correct each time: it was certain about two images (hard-leaved pocket orchid and wild pansy), quite confident about one (cautleya spicata), but it barely recognized the flower type in the remaining image (orange dahlia).
!jupyter nbconvert *.ipynb